We propose a method of representing the digested information of each dataset, with the aid of innovative ideas, to support communication among data users who attempt to create valuable products, services, and business models by using or combining datasets. In contrast to methods that connect datasets via shared attributes (i.e., variables), this method connects datasets via the events, situations, or actions in which the datasets should be active in the real world. The method reflects consideration of each piece of metadata's fitness to a feature concept, which is a summary of the information or knowledge expected to be acquired from the data; as a result, data users obtain practical knowledge that fits the requirements of real businesses and real life, as well as a basis for applying AI technologies to the data.
Although robot-based automation in chemical laboratories can accelerate the materials development process, unattended environments may lead to dangerous accidents, primarily due to machine-control errors. Object detection techniques can play a vital role in addressing these safety issues; however, state-of-the-art detectors, including single-shot detector (SSD) models, suffer from insufficient accuracy in environments involving complex and noisy scenes. To improve the safety of unattended laboratories, we report a novel deep-learning (DL)-based object detector, namely DenseSSD. On the foremost and frequent problem of detecting vial positions, DenseSSD achieved a mean average precision (mAP) over 95% on a complex dataset involving both empty and solution-filled vials, greatly exceeding that of conventional detectors; such high precision is critical to minimizing failure-induced accidents. Moreover, DenseSSD was observed to be highly insensitive to environmental changes, maintaining its high precision under variations in solution color or testing view angle. This robustness allows more flexible equipment settings. This work demonstrates that DenseSSD is useful for enhancing safety in automated material-synthesis environments, and it can be extended to various applications that require high detection accuracy and speed.
To use a navigation system effectively, distance-information sensors such as depth sensors are essential. Since depth sensors are difficult to use in endoscopy, many groups have proposed methods based on convolutional neural networks. In this paper, ground-truth depth images and endoscopy images are generated through endoscope simulation using a colon model segmented from CT colonography. Photo-realistic simulation images can then be created by applying a sim-to-real approach with CycleGAN to the endoscopy images. By training on the generated dataset, we propose a quantitative endoscopy depth estimation network. The proposed method achieves better evaluation scores than existing unsupervised training-based results.
In this work, we propose a new approach that combines data from multiple sensors for reliable obstacle avoidance. The sensors include two depth cameras and a LiDAR arranged so that they can capture the whole 3D area in front of the robot and a 2D slice around it. To fuse the data from these sensors, we first use an external camera as a reference to combine data from the two depth cameras. A projection technique is then introduced to convert the 3D point cloud data of the cameras to its 2D correspondence. An obstacle avoidance algorithm is then developed based on the dynamic window approach. A number of experiments have been conducted to evaluate our proposed approach. The results show that the robot can effectively avoid static and dynamic obstacles of different shapes and sizes in different environments.
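The projection step described above, collapsing 3D camera points into a 2D representation that a dynamic-window planner can consume, can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function name, height band, and beam count are all hypothetical choices.

```python
import math

def pointcloud_to_2d_scan(points, z_min=0.05, z_max=1.5, n_beams=360, max_range=10.0):
    """Collapse a 3D point cloud [(x, y, z), ...] into a 2D range scan.

    Points inside the height band [z_min, z_max] are projected onto the
    ground plane; each angular bin keeps the nearest range, emulating the
    planar scan a dynamic-window planner consumes.
    """
    ranges = [max_range] * n_beams
    for x, y, z in points:
        if not (z_min <= z <= z_max):
            continue  # floor or overhead clutter: not an obstacle for the robot body
        angle = math.atan2(y, x)  # bearing of the point, in [-pi, pi]
        beam = int((angle + math.pi) / (2 * math.pi) * n_beams) % n_beams
        ranges[beam] = min(ranges[beam], math.hypot(x, y))  # keep nearest hit per beam
    return ranges

# One obstacle point 2 m directly ahead of the robot lands in the forward beam.
scan = pointcloud_to_2d_scan([(2.0, 0.0, 0.5)])
```

A planner then only needs the per-beam nearest range, which is why the 2D reduction loses nothing relevant to collision checking at the robot's height.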
We introduce an approach for the answer-aware question generation problem. Instead of relying only on the capability of strong pre-trained language models, we observe that the information of answers and questions can be found in some relevant sentences in the context. Based on that, we design a model which includes two modules: a selector and a generator. The selector forces the model to focus more on relevant sentences regarding an answer to provide implicit local information. The generator generates questions by implicitly combining local information from the selector and global information from the whole context encoded by the encoder. The model is trained jointly to take advantage of latent interactions between the two modules. Experimental results on two benchmark datasets show that our model is better than strong pre-trained models for the question generation task. The code is also available (shorturl.at/lV567).
We introduce TeSS (Text Similarity Comparison using Sentence Encoder), a framework for zero-shot classification where the assigned label is determined by the embedding similarity between the input text and each candidate label prompt. We leverage representations from sentence encoders optimized to locate semantically similar samples closer to each other in embedding space during pre-training. The label prompt embeddings serve as prototypes of their corresponding class clusters. Furthermore, to compensate for potentially poorly descriptive labels in their original format, we retrieve semantically similar sentences from external corpora and additionally use them alongside the original label prompt (TeSS-R). TeSS outperforms strong baselines on various closed-set and open-set classification datasets under the zero-shot setting, with further gains when combined with label prompt diversification through retrieval. These results are robust to verbalizer variations, an ancillary benefit of using a bi-encoder. Altogether, our method serves as a reliable baseline for zero-shot classification and a simple interface to assess the quality of sentence encoders.
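The core decision rule, assigning the label whose prompt embedding is nearest to the input embedding under cosine similarity, can be sketched as below. The toy 3-dimensional vectors stand in for real sentence-encoder outputs; `tess_classify` and the example labels are illustrative assumptions, not the released TeSS code.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def tess_classify(text_vec, label_vecs):
    """Zero-shot rule: pick the label whose prompt embedding is most
    similar to the input-text embedding."""
    return max(label_vecs, key=lambda label: cosine(text_vec, label_vecs[label]))

# Toy embeddings; a real system would obtain these from a sentence encoder
# applied to the input text and to each label prompt.
labels = {"sports": [0.9, 0.1, 0.0], "politics": [0.1, 0.9, 0.1]}
pred = tess_classify([0.8, 0.2, 0.1], labels)
```

Because only embeddings are compared, adding retrieved sentences (as in TeSS-R) amounts to averaging or pooling extra prompt embeddings per label before the same comparison.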
A household robot should be able to navigate to target locations without requiring users to first annotate everything in their home. Current approaches to this object navigation challenge do not test on real robots and rely on expensive semantically labeled 3D meshes. In this work, our aim is an agent that builds self-supervised models of the world via exploration, much as a child might. We propose an end-to-end self-supervised embodied agent that leverages exploration to train a semantic segmentation model of 3D objects, and uses those representations to learn an object navigation policy purely from self-labeled 3D meshes. The key insight is that embodied agents can leverage location consistency as a supervision signal - collecting images from different views/angles and applying contrastive learning to fine-tune a semantic segmentation model. In our experiments, we observe that our framework performs better than other self-supervised baselines and competitively with supervised baselines, in both simulation and when deployed in real houses.
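The location-consistency signal reduces to a standard contrastive (InfoNCE-style) objective over views: two images of the same 3D location form a positive pair, while views of other locations serve as negatives. A minimal sketch under those assumptions follows; the function name, temperature, and toy features are hypothetical, not the paper's code.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor embedding: pull the positive (another
    view of the same 3D location) close, push views of other locations away."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv)

    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]
    # Numerically stable -log softmax of the positive (index 0).
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
    return -(logits[0] - log_denom)

# Views of the same location (identical toy features) yield a near-zero loss.
loss = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
```

Minimizing this loss across many anchor/positive pairs is what lets the segmentation backbone learn object features without any human labels.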
This paper presents a solution to the Weather4cast 2022 Challenge Stage 2. The goal of the challenge is to forecast future high-resolution rainfall events obtained from ground radar using low-resolution multiband satellite images. We suggest a solution that performs data preprocessing appropriate to the challenge and then predicts rainfall movies using a novel RainUNet. RainUNet is a hierarchical U-shaped network with a temporal-wise separable block (TS block) using a decoupled large-kernel 3D convolution to improve prediction performance. Various evaluation metrics show that our solution is effective compared to the baseline method. The source code is available at https://github.com/jinyxp/Weather4cast-2022
Federated Learning has emerged to cope with rising concerns about privacy breaches in using Machine or Deep Learning models. This new paradigm allows deep learning models to be leveraged in a distributed manner, enhancing privacy preservation. However, the server's blindness to local datasets makes it vulnerable to model poisoning attacks and data heterogeneity, which degrade the global model's performance. Numerous works have proposed robust aggregation algorithms and defensive mechanisms, but each approach addresses only individual attacks or issues. FedCC, the proposed method, provides robust aggregation by comparing the Centered Kernel Alignment of penultimate-layer representations. Experimental results demonstrate that FedCC mitigates untargeted and targeted model poisoning and backdoor attacks while also remaining effective on non-independently-and-identically-distributed data. Against untargeted attacks, FedCC recovers the most global-model accuracy; against targeted backdoor attacks, it nullifies attack confidence while preserving test accuracy. Most of the experimental results outperform the baseline methods.
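Centered Kernel Alignment, the similarity measure FedCC compares across clients' penultimate-layer representations, has a simple linear form: the squared Frobenius norm of the cross-covariance, normalized by the self-covariances. A minimal sketch of linear CKA is below; this is an illustrative stdlib implementation, not the authors' code, and real use would operate on per-client activation matrices.

```python
import math

def center(X):
    """Column-center an n x d matrix (list of rows): subtract each feature's mean."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(X[0]))]
    return [[row[j] - means[j] for j in range(len(row))] for row in X]

def gram_fro2(A, B):
    """Squared Frobenius norm of A^T B for n x d_a and n x d_b matrices."""
    total = 0.0
    for i in range(len(A[0])):
        for j in range(len(B[0])):
            s = sum(A[k][i] * B[k][j] for k in range(len(A)))
            total += s * s
    return total

def linear_cka(X, Y):
    """Linear CKA between two representation matrices (n samples x d features).
    Returns 1.0 for identical (up to scaling) representations, near 0 for unrelated ones."""
    X, Y = center(X), center(Y)
    return gram_fro2(X, Y) / math.sqrt(gram_fro2(X, X) * gram_fro2(Y, Y))

X = [[1.0, 2.0], [3.0, 4.0], [5.0, 7.0]]
identical = linear_cka(X, X)  # identical representations score 1.0
```

Because CKA is invariant to isotropic scaling and orthogonal transforms, it can compare clients' representations even when their weights differ, which is what makes it a plausible signal for spotting poisoned updates.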
Generative models have shown great promise in synthesizing photorealistic 3D objects, but they require large amounts of training data. We introduce SinGRAF, a 3D-aware generative model that is trained with a few input images of a single scene. Once trained, SinGRAF generates different realizations of this 3D scene that preserve the appearance of the input while varying scene layout. For this purpose, we build on recent progress in 3D GAN architectures and introduce a novel progressive-scale patch discrimination approach during training. With several experiments, we demonstrate that the results produced by SinGRAF outperform the closest related works in both quality and diversity by a large margin.